Search | VHL Regional Portal

1.

Optimizing selection based on BLUPs or BLUEs in multiple sets of genotypes differing in their population parameters.

Melchinger, Albrecht E; Fernando, Rohan; Melchinger, Andreas J; Schön, Chris-Carolin.

Theor Appl Genet ; 137(5): 104, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38622324

ABSTRACT

KEY MESSAGE: Selection response in truncation selection across multiple sets of candidates hinges on their post-selection proportions, which can deviate grossly from their initial proportions. For BLUPs, using a uniform threshold for all candidates maximizes the selection response, irrespective of differences in population parameters. Plant breeding programs typically involve multiple families from either the same or different populations, varying in means, genetic variances and prediction accuracy of BLUPs or BLUEs for true genetic values (TGVs) of candidates. We extend the classical breeder's equation for truncation selection from single to multiple sets of genotypes, indicating that the expected overall selection response (ΔG_Tot) for TGVs depends on the selection response within individual sets and their post-selection proportions. For BLUEs, we show that maximizing ΔG_Tot requires thresholds optimally tailored for each set, contingent on their population parameters. For BLUPs, we prove that ΔG_Tot is maximized by applying a uniform threshold across all candidates from all sets. We provide explicit formulas for the origin of the selected candidates from different sets and show that their proportions before and after selection can differ substantially, especially for sets with inferior properties and low proportion. We discuss implications of these results for (a) optimum allocation of resources to training and prediction sets and (b) the need to counteract narrowing the genetic variation under genomic selection. For genomic selection of hybrids based on BLUPs of GCA of their parent lines, selecting distinct proportions in the two parent populations can be advantageous, if these differ substantially in the variance and/or prediction accuracy of GCA.
Our study sheds light on the complex interplay of selection thresholds and population parameters for the selection response in plant breeding programs, offering insights into the effective resource management and prudent application of genomic selection for improved crop development.
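
The effect of a uniform threshold on post-selection proportions can be illustrated numerically. The following is a minimal sketch, not the paper's formulas: it assumes that BLUPs within each set are normally distributed, and the function name, the tuple layout and all parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and pdf


def uniform_threshold_selection(sets, threshold):
    """Apply one truncation threshold to several sets of candidates.

    sets: list of (initial_proportion, blup_mean, blup_sd) tuples
          (hypothetical layout; values are assumptions, not data).
    Returns (overall selected fraction, post-selection proportions,
             mean BLUP of all selected candidates).
    """
    weights = []         # initial proportion times selected fraction, per set
    selected_means = []  # mean BLUP of the selected candidates, per set
    for proportion, mean, sd in sets:
        z = (threshold - mean) / sd
        alpha = 1.0 - STD_NORMAL.cdf(z)  # selected fraction within the set
        # Mean of a normal distribution truncated from below at the threshold.
        selected_means.append(mean + sd * STD_NORMAL.pdf(z) / alpha)
        weights.append(proportion * alpha)
    total_selected = sum(weights)
    post = [w / total_selected for w in weights]
    overall_mean = sum(q * m for q, m in zip(post, selected_means))
    return total_selected, post, overall_mean


# Hypothetical example: a large set and a small set with inferior mean and spread.
total, post, mean_selected = uniform_threshold_selection(
    [(0.8, 0.0, 1.0), (0.2, -0.5, 0.7)], threshold=1.0
)
print(total, post, mean_selected)
```

With these made-up inputs, the second set contributes a much smaller share of the selected candidates than its initial 20%, which is the kind of shift between initial and post-selection proportions the abstract describes.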

Subjects

Plant Breeding, Genetic Selection, Humans, Plant Breeding/methods, Genotype, Plants/genetics, Genomics/methods, Genetic Models, Phenotype

2.

Subgenome-aware analyses reveal the genomic consequences of ancient allopolyploid hybridizations throughout the cotton family.

Sun, Pengchuan; Lu, Zhiqiang; Wang, Zhenyue; Wang, Shang; Zhao, Kexin; Mei, Dong; Yang, Jiao; Yang, Yongzhi; Renner, Susanne S; Liu, Jianquan.

Proc Natl Acad Sci U S A ; 121(15): e2313921121, 2024 Apr 09.

Article in English | MEDLINE | ID: mdl-38568968

ABSTRACT

Malvaceae comprise some 4,225 species in 243 genera and nine subfamilies and include economically important species, such as cacao, cotton, durian, and jute, with cotton an important model system for studying the domestication of polyploids. Here, we use chromosome-level genome assemblies from representatives of five or six subfamilies (depending on the placement of Ochroma) to differentiate coexisting subgenomes and their evolution during the family's deep history. The results reveal that the allohexaploid Helicteroideae partially derive from an allotetraploid Sterculioideae and also form a component of the allodecaploid Bombacoideae and Malvoideae. The ancestral Malvaceae karyotype consists of 11 protochromosomes. Four subfamilies share a unique reciprocal chromosome translocation, and two other subfamilies share a chromosome fusion. DNA alignments of single-copy nuclear genes do not yield the same relationships as inferred from chromosome structural traits, probably because of genes originating from different ancestral subgenomes. These results illustrate how chromosome-structural data can unravel the evolutionary history of groups with ancient hybrid genomes.

Subjects

Plant Genome, Gossypium, Plant Genome/genetics, Gossypium/genetics, Genomics/methods, Polyploidy, Karyotype, Molecular Evolution

3.

Large-scale genome-wide SNP analysis reveals the rugged (and ragged) landscape of global ancestry, phylogeny, and demographic history in chicken breeds.

Dementieva, Natalia V; Shcherbakov, Yuri S; Stanishevskaya, Olga I; Vakhrameev, Anatoly B; Larkina, Tatiana A; Dysin, Artem P; Nikolaeva, Olga A; Ryabova, Anna E; Azovtseva, Anastasiia I; Mitrofanova, Olga V; Peglivanyan, Grigoriy K; Reinbach, Natalia R; Griffin, Darren K; Romanov, Michael N.

J Zhejiang Univ Sci B ; 25(4): 324-340, 2024 Apr 15.

Article in English, Chinese | MEDLINE | ID: mdl-38584094

ABSTRACT

The worldwide chicken gene pool encompasses a remarkable, but shrinking, number of divergently selected breeds of diverse origin. This study was a large-scale genome-wide analysis of the landscape of the complex molecular architecture, genetic variability, and detailed structure among 49 populations. These populations represent a significant sample of the world's chicken breeds from Europe (Russia, Czech Republic, France, Spain, UK, etc.), Asia (China), North America (USA), and Oceania (Australia). Based on the results of breed genotyping using the Illumina 60K single nucleotide polymorphism (SNP) chip, a bioinformatic analysis was carried out. This included the calculation of heterozygosity/homozygosity statistics, inbreeding coefficients, and effective population size. It also included assessment of linkage disequilibrium and construction of phylogenetic trees. Using multidimensional scaling, principal component analysis, and ADMIXTURE-assisted global ancestry analysis, we explored the genetic structure of populations and subpopulations in each breed. An overall 49-population phylogeny analysis was also performed, and a refined evolutionary model of chicken breed formation was proposed, which included egg, meat, dual-purpose types, and ambiguous breeds. Such a large-scale survey of genetic resources in poultry farming using modern genomic methods is of great interest both from the viewpoint of a general understanding of the genetics of the domestic chicken and for the further development of genomic technologies and approaches in poultry breeding. In general, whole genome SNP genotyping of promising chicken breeds from the worldwide gene pool will promote the further development of modern genomic science as applied to poultry.

Subjects

Chickens, Genome, Animals, Phylogeny, Chickens/genetics, Genomics/methods, Demography, Single Nucleotide Polymorphism, Genetic Variation

4.

LettuceDB: an integrated multi-omics database for cultivated lettuce.

Zhou, Wenhui; Yang, Tao; Zeng, Liucui; Chen, Jing; Wang, Yayu; Guo, Xing; You, Lijin; Liu, Yiqun; Du, Wensi; Yang, Fan; Hua, Cong; Cai, Jia; van Hintum, Theo; Liu, Huan; Gu, Ying; Wei, Xiaofeng; Wei, Tong.

Database (Oxford) ; 2024, 2024 Apr 01.

Article in English | MEDLINE | ID: mdl-38557635

ABSTRACT

Crop genomics has advanced rapidly during the past decade, which generated a great abundance of omics data from multi-omics studies. How to utilize the accumulating data becomes a critical and urgent demand in crop science. As an attempt to integrate multi-omics data, we developed a database, LettuceDB (https://db.cngb.org/lettuce/), aiming to assemble multidimensional data for cultivated and wild lettuce germplasm. The database includes genome, variome, phenome, microbiome and spatial transcriptome. By integrating user-friendly bioinformatics tools, LettuceDB will serve as a one-stop platform for lettuce research and breeding in the future. Database URL: https://db.cngb.org/lettuce/.

Subjects

Lettuce, Multiomics, Lettuce/genetics, Plant Breeding, Genomics/methods, Genetic Databases

5.

Accuracy of Genomic prediction for fleece traits in Inner Mongolia Cashmere goats.

Yan, Xiaochun; Li, Jinquan; He, Libing; Chen, Oljibilig; Wang, Na; Wang, Shuai; Wang, Xiuyan; Wang, Zhiying; Su, Rui.

BMC Genomics ; 25(1): 349, 2024 Apr 08.

Article in English | MEDLINE | ID: mdl-38589806

ABSTRACT

Fleece traits are important economic traits of goats. With the reduction of sequencing and genotyping costs and the improvement of related technologies, genomic selection for goats has become possible. This study collected pedigree, phenotype and genotype information from 2299 Inner Mongolia Cashmere goats (IMCGs). We estimated fixed effects and compared the estimates of variance components, heritability and genomic predictive ability of fleece traits in IMCGs when using pedigree-based Best Linear Unbiased Prediction (ABLUP), Genomic BLUP (GBLUP) or single-step GBLUP (ssGBLUP). The fleece traits considered are cashmere production (CP), cashmere diameter (CD), cashmere length (CL) and fiber length (FL). Year of production, sex, herd and individual age had highly significant effects on the four fleece traits (P < 0.01). All of these factors should be considered when the genetic parameters of fleece traits in IMCGs are evaluated. The heritabilities of FL, CL, CP and CD with the ABLUP, GBLUP and ssGBLUP methods were 0.26 ~ 0.31, 0.05 ~ 0.08, 0.15 ~ 0.20 and 0.22 ~ 0.28, respectively. Therefore, it can be inferred that the genetic progress of CL is relatively slow. The predictive ability of fleece traits in IMCGs with GBLUP (56.18% to 69.06%) and ssGBLUP (66.82% to 73.70%) was significantly higher than that of ABLUP (36.73% to 41.25%). The predictive ability of ssGBLUP was significantly (29% ~ 33%) higher than that of ABLUP and slightly (4% ~ 14%) higher than that of GBLUP. ssGBLUP is therefore a superior method for genomic selection of fleece traits in Inner Mongolia Cashmere goats.

Subjects

Genome, Goats, Humans, Animals, Goats/genetics, Genomics/methods, Phenotype, Genotype, Genetic Models

6.

A comprehensive benchmark of graph-based genetic variant genotyping algorithms on plant genomes for creating an accurate ensemble pipeline.

Du, Ze-Zhen; He, Jia-Bao; Jiao, Wen-Biao.

Genome Biol ; 25(1): 91, 2024 Apr 08.

Article in English | MEDLINE | ID: mdl-38589937

ABSTRACT

BACKGROUND: Although sequencing technologies have boosted the measurement of the genomic diversity of plant crops, it remains challenging to accurately genotype millions of genetic variants, especially structural variations, with only short reads. In recent years, many graph-based variation genotyping methods have been developed to address this issue and tested for human genomes. However, their performance in plant genomes remains largely elusive. Furthermore, pipelines integrating the advantages of current genotyping methods might be required, considering the different complexity of plant genomes. RESULTS: Here we comprehensively evaluate eight such genotypers in different scenarios in terms of variant type and size, sequencing parameters, genomic context, and complexity, as well as graph size, using both simulated and real data sets from representative plant genomes. Our evaluation reveals that there are still great challenges to applying existing methods to plants, such as excessive repeats and variants or high resource consumption. Therefore, we propose a pipeline called Ensemble Variant Genotyper (EVG) that can achieve better genotyping performance in almost all experimental scenarios and comparably higher genotyping recall and precision even using 5× reads. Furthermore, we demonstrate that EVG is more robust with an increasing number of graphed genomes, especially for insertions and deletions. CONCLUSIONS: Our study will provide new insights into the development and application of graph-based genotyping algorithms. We conclude that EVG provides an accurate, unbiased, and cost-effective way for genotyping both small and large variations and will be potentially used in population-scale genotyping for large, repetitive, and heterozygous plant genomes.

Subjects

Algorithms, Benchmarking, Humans, Genotype, Genomics/methods, Genotyping Techniques/methods, Plant Genome, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods

7.

Genomic evidence for human-mediated introgressive hybridization and selection in the developed breed.

Du, Heng; Liu, Zhen; Lu, Shi-Yu; Jiang, Li; Zhou, Lei; Liu, Jian-Feng.

BMC Genomics ; 25(1): 331, 2024 Apr 02.

Article in English | MEDLINE | ID: mdl-38565992

ABSTRACT

BACKGROUND: The pig (Sus scrofa) is one of the oldest domesticated livestock species and has undergone extensive improvement through modern breeding. European breeds have advantages in lean meat development and a highly productive body type, whereas Asian breeds possess extraordinary fat deposition and reproductive performance. Consequently, Eurasian breeds have been extensively used to develop modern commercial breeds for fast growth and high prolificacy. However, limited by sequencing technology, the genome architecture of some nascent developed breeds and the human-mediated impact on their genomes are still unknown. RESULTS: Through whole-genome analysis of 178 individuals from an Asian locally developed pig breed, the Beijing Black pig, and its two ancestors from two different continents, we found pervasive inconsistency between gene trees and species trees across the genome of the Beijing Black pig, which suggests its introgressive hybrid origin. Interestingly, we discovered that this developed breed is genetically closer to European pigs and carries an unexpected introgression from Asian pigs, which indicates that human-mediated introgression can shape the porcine genome architecture in a completely different way compared to native introgression. We identified 554 genomic regions, occupying 63.30 Mb, with signals of introgression from the Asian ancestry to the Beijing Black pig, and the genes in these regions were enriched in pathways associated with meat quality, fertility, and disease resistance. Additionally, 7.77% of genomic regions were recognized as having been under selection. Moreover, combined with the results of a genome-wide association study for meat quality traits in a population of 1537 Beijing Black pigs, two important candidate genes related to meat quality traits were identified. DNAJC6 is related to intramuscular fat content and fat deposition, and RUFY4 is related to meat pH and tenderness.
CONCLUSIONS: Our research provides insight for analyzing the origins of nascent developed breeds and genome-wide selection remaining in the developed breeds mediated by humans during modern breeding.

Subjects

Genetic Introgression, Genome-Wide Association Study, Humans, Animals, Swine/genetics, Genome, Genomics/methods, Breeding, Single Nucleotide Polymorphism, Sus scrofa/genetics, Genetic Selection

8.

MMGAT: a graph attention network framework for ATAC-seq motifs finding.

Wu, Xiaotian; Hou, Wenju; Zhao, Ziqi; Huang, Lan; Sheng, Nan; Yang, Qixing; Zhang, Shuangquan; Wang, Yan.

BMC Bioinformatics ; 25(1): 158, 2024 Apr 20.

Article in English | MEDLINE | ID: mdl-38643066

ABSTRACT

BACKGROUND: Motif finding in Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) data is essential to reveal the intricacies of transcription factor binding sites (TFBSs) and their pivotal roles in gene regulation. Deep learning technologies, including convolutional neural networks (CNNs) and graph neural networks (GNNs), have achieved success in finding ATAC-seq motifs. However, CNN-based methods are limited by the fixed width of the convolutional kernel, which makes it difficult to find multiple transcription factor binding sites with different lengths. GNN-based methods are limited by using the edge weight information directly, which makes it difficult to aggregate the neighboring nodes' information efficiently when representing node embeddings. RESULTS: To address these challenges, we developed a novel graph attention network framework named MMGAT, which employs an attention mechanism to adjust the attention coefficients among different nodes. MMGAT then finds multiple ATAC-seq motifs based on the attention coefficients of sequence nodes and k-mer nodes as well as the coexisting probability of k-mers. Our approach achieved better performance on the human ATAC-seq datasets compared to existing tools, as evidenced by the highest scores on the precision, recall, F1_score, ACC, AUC, and PRC metrics, as well as by finding 389 higher-quality motifs. To validate the performance of MMGAT in predicting TFBSs and finding motifs on more datasets, we enlarged the number of human ATAC-seq datasets to 180 and newly integrated 80 mouse ATAC-seq datasets for multi-species experimental validation. On the mouse ATAC-seq datasets specifically, MMGAT also achieved the highest scores on the six metrics and found 356 higher-quality motifs. To facilitate researchers in utilizing MMGAT, we have also developed a user-friendly web server named MMGAT-S that hosts the MMGAT method and ATAC-seq motif finding results.
CONCLUSIONS: The advanced methodology MMGAT provides a robust tool for finding ATAC-seq motifs, and the comprehensive server MMGAT-S makes a significant contribution to genomics research. The open-source code of MMGAT can be found at https://github.com/xiaotianr/MMGAT , and MMGAT-S is freely available at https://www.mmgraphws.com/MMGAT-S/ .

Subjects

Chromatin Immunoprecipitation Sequencing, Genomics, Humans, Animals, Mice, Binding Sites, Protein Binding, Genomics/methods, Chromatin/genetics, Transcription Factors/metabolism

9.

TMBstable: a variant caller controls performance variation across heterogeneous sequencing samples.

Wang, Shenjie; Zhu, Xiaoyan; Wang, Xuwen; Liu, Yuqian; Zhao, Minchao; Chang, Zhili; Wang, Xiaonan; Shao, Yang; Wang, Jiayin.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38632951

ABSTRACT

In cancer genomics, variant calling has advanced, but traditional mean accuracy evaluations are inadequate for biomarkers like tumor mutation burden, which vary significantly across samples, affecting immunotherapy patient selection and threshold settings. In this study, we introduce TMBstable, an innovative method that dynamically selects optimal variant calling strategies for specific genomic regions using a meta-learning framework, distinguishing it from traditional callers with uniform sample-wide strategies. The process begins with segmenting the sample into windows and extracting meta-features for clustering, followed by using a pre-trained meta-model to select suitable algorithms for each cluster, thereby addressing strategy-sample mismatches, reducing performance fluctuations and ensuring consistent performance across various samples. We evaluated TMBstable using both simulated and real non-small cell lung cancer and nasopharyngeal carcinoma samples, comparing it with advanced callers. The assessment, focusing on stability measures, such as the variance and coefficient of variation in false positive rate, false negative rate, precision and recall, involved 300 simulated and 106 real tumor samples. Benchmark results showed TMBstable's superior stability with the lowest variance and coefficient of variation across performance metrics, highlighting its effectiveness in analyzing the counting-based biomarker. The TMBstable algorithm can be accessed at https://github.com/hello-json/TMBstable for academic usage only.
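
The stability measures named in the abstract (variance and coefficient of variation of a metric across samples) can be computed as in the following minimal sketch. This is not the TMBstable code: the function name and the recall values are hypothetical, and the use of the population (rather than sample) variance is an assumption.

```python
from statistics import mean, pstdev, pvariance


def stability(values):
    """Return (variance, coefficient of variation) of one metric across samples.

    Uses population variance and standard deviation (an assumption; the
    abstract does not say which estimator was used).
    """
    mu = mean(values)
    variance = pvariance(values, mu)
    cv = pstdev(values, mu) / mu if mu != 0 else float("nan")
    return variance, cv


# Hypothetical per-sample recall values for two callers (not real results).
recall_caller_a = [0.90, 0.91, 0.89, 0.90]
recall_caller_b = [0.99, 0.80, 0.95, 0.86]

print(stability(recall_caller_a))
print(stability(recall_caller_b))
```

A caller with a lower variance and coefficient of variation performs more consistently across samples, even if its mean accuracy is similar to that of another caller.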

Subjects

Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Humans, High-Throughput Nucleotide Sequencing/methods, Genomics/methods, Genome, Algorithms

10.

Statistical sampling of missing environmental variables improves biophysical genomic prediction in wheat.

Jighly, Abdulqader; Thayalakumaran, Thabo; Kant, Surya; Panozzo, Joe; Aggarwal, Rajat; Hessel, David; Forrest, Kerrie L; Technow, Frank; Totir, Radu; Goddard, Mike; Pryce, Jennie; Hayden, Matthew J; Munkvold, Jesse; O'Leary, Garry J.

Theor Appl Genet ; 137(5): 108, 2024 Apr 18.

Article in English | MEDLINE | ID: mdl-38637355

ABSTRACT

KEY MESSAGE: The integration of genomic prediction with crop growth models enabled the estimation of missing environmental variables which improved the prediction accuracy of grain yield. Since the invention of whole-genome prediction (WGP) more than two decades ago, breeding programmes have established extensive reference populations that are cultivated under diverse environmental conditions. The introduction of the CGM-WGP model, which integrates crop growth models (CGM) with WGP, has expanded the applications of WGP to the prediction of unphenotyped traits in untested environments, including future climates. However, CGMs require multiple seasonal environmental records, unlike WGP, which makes CGM-WGP less accurate when applied to historical reference populations that lack crucial environmental inputs. Here, we investigated the ability of CGM-WGP to approximate missing environmental variables to improve prediction accuracy. Two environmental variables in a wheat CGM, initial soil water content (InitlSoilWCont) and initial nitrate profile, were sampled from different normal distributions separately or jointly in each iteration within the CGM-WGP algorithm. Our results showed that sampling InitlSoilWCont alone gave the best results and improved the prediction accuracy of grain number by 0.07, yield by 0.06 and protein content by 0.03. When using the sampled InitlSoilWCont values as an input for the traditional CGM, the average narrow-sense heritability of the genotype-specific parameters (GSPs) improved by 0.05, with GNSlope, PreAnthRes, and VernSen showing the greatest improvements. Moreover, the root mean square of errors for grain number and yield was reduced by about 7% for CGM and 31% for CGM-WGP when using the sampled InitlSoilWCont values. 
Our results demonstrate the advantage of sampling missing environmental variables in CGM-WGP to improve prediction accuracy and increase the size of the reference population by enabling the utilisation of historical data that are missing environmental records.

Subjects

Plant Breeding, Triticum, Triticum/genetics, Genome, Genomics/methods, Genotype, Phenotype, Edible Grain/genetics, Genetic Models

11.

Genome assembly of Melilotus officinalis provides a new reference genome for functional genomics.

Meng, Aoran; Li, Xinru; Li, Zhiguang; Miao, Fuhong; Ma, Lichao; Li, Shuo; Sun, Wenfei; Huang, Jianwei; Yang, Guofeng.

BMC Genom Data ; 25(1): 37, 2024 Apr 18.

Article in English | MEDLINE | ID: mdl-38637749

ABSTRACT

BACKGROUND: Sweet yellow clover (Melilotus officinalis) is a diploid plant (2n = 16) that is native to Europe. It is an excellent legume forage that can both fix nitrogen and serve as a medicine. A genome assembly of Melilotus officinalis that was collected from Best corporation in Beijing is available based on Nanopore sequencing. The genome of Melilotus officinalis was sequenced, assembled, and annotated. RESULTS: The latest PacBio third-generation HiFi sequencing and assembly strategies were used to produce a Melilotus officinalis genome assembly with a size of 1,066 Mbp, contig N50 = 5 Mbp, scaffold N50 = 130 Mbp, and complete benchmarking universal single-copy orthologs (BUSCOs) = 96.4%. Annotation produced 47,873 high-confidence gene models, which will substantially aid in our research on molecular breeding. A collinear analysis showed that Melilotus officinalis and Medicago truncatula shared conserved synteny. The analysis of gene family expansion and contraction showed that 565 gene families expanded and 56 gene families contracted in Melilotus officinalis. The contracted gene families were associated with response to stimulus, nucleotide binding, and small molecule binding. Thus, it is related to a family of genes associated with peptidase activity, which could lead to better stress tolerance in plants. CONCLUSIONS: In this study, the latest PacBio technology was used to sequence and assemble the genome of Melilotus officinalis and annotate its protein-coding genes. These results will expand the genomic resources available for Melilotus officinalis and should assist in subsequent research on sweet yellow clover plants.
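
The contig and scaffold N50 values quoted in the abstract follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name and the example lengths are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Illustrative lengths: total 32, so the two sequences of length 8 reach half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```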

Subjects

Medicago truncatula, Melilotus, Genomics/methods, Genome Size, Synteny

12.

Measuring, visualizing, and diagnosing reference bias with biastools.

Lin, Mao-Jan; Iyer, Sheila; Chen, Nae-Chyun; Langmead, Ben.

Genome Biol ; 25(1): 101, 2024 Apr 19.

Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.

Subjects

Genome, Genomics, Genomics/methods, Computational Biology, INDEL Mutation, Bias, DNA Sequence Analysis/methods, Software, High-Throughput Nucleotide Sequencing/methods

13.

Understanding genetic variability: exploring large-scale copy number variants through non-invasive prenatal testing in European populations.

Holesova, Zuzana; Pös, Ondrej; Gazdarica, Juraj; Kucharik, Marcel; Budis, Jaroslav; Hyblova, Michaela; Minarik, Gabriel; Szemes, Tomas.

BMC Genomics ; 25(1): 366, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38622538

ABSTRACT

Large-scale copy number variants (CNVs) are structural alterations in the genome that involve the duplication or deletion of DNA segments, contributing to genetic diversity and playing a crucial role in the evolution and development of various diseases and disorders, as they can lead to the dosage imbalance of one or more genes. Massively parallel sequencing (MPS) has revolutionized the field of genetic analysis and contributed significantly to routine clinical diagnosis and screening. It offers a precise method for detecting CNVs with exceptional accuracy. In this context, a non-invasive prenatal test (NIPT) based on the sequencing of cell-free DNA (cfDNA) from pregnant women's plasma using a low-coverage whole genome MPS (WGS) approach represents a valuable source for population studies. Here, we analyzed genomic data of 12,732 pregnant women from the Slovak (9,230), Czech (1,583), and Hungarian (1,919) populations. We identified 5,062 CNVs ranging from 200 kbp and described their basic characteristics and differences between the subject populations. Our results suggest that re-analysis of sequencing data from routine WGS assays has the potential to obtain large-scale CNV population frequencies, which are not well known and may provide valuable information to support the classification and interpretation of this type of genetic variation. Furthermore, this could contribute to expanding knowledge about the central European genome without investing in additional laboratory work, as NIPTs are a relatively widely used screening method.

Subjects

Cell-Free Nucleic Acids, DNA Copy Number Variations, Pregnancy, Female, Humans, Prenatal Diagnosis/methods, Whole Genome Sequencing/methods, Genomics/methods, Genetic Testing

14.

Earl Grey: A Fully Automated User-Friendly Transposable Element Annotation and Analysis Pipeline.

Baril, Tobias; Galbraith, James; Hayward, Alex.

Mol Biol Evol ; 41(4), 2024 Apr 02.

Article in English | MEDLINE | ID: mdl-38577785

ABSTRACT

Transposable elements (TEs) are major components of eukaryotic genomes and are implicated in a range of evolutionary processes. Yet, TE annotation and characterization remain challenging, particularly for nonspecialists, since existing pipelines are typically complicated to install, run, and extract data from. Current methods of automated TE annotation are also subject to issues that reduce overall quality, particularly (i) fragmented and overlapping TE annotations, leading to erroneous estimates of TE count and coverage, and (ii) repeat models represented by short sections of total TE length, with poor capture of 5' and 3' ends. To address these issues, we present Earl Grey, a fully automated TE annotation pipeline designed for user-friendly curation and annotation of TEs in eukaryotic genome assemblies. Using nine simulated genomes and an annotation of Drosophila melanogaster, we show that Earl Grey outperforms current widely used TE annotation methodologies in ameliorating the issues mentioned above while scoring highly in benchmarking for TE annotation and classification and being robust across genomic contexts. Earl Grey provides a comprehensive and fully automated TE annotation toolkit that provides researchers with paper-ready summary figures and outputs in standard formats compatible with other bioinformatics tools. Earl Grey has a modular format, with great scope for the inclusion of additional modules focused on further quality control and tailored analyses in future releases.

Subjects

DNA Transposable Elements, Drosophila melanogaster, Animals, DNA Transposable Elements/genetics, Molecular Sequence Annotation, Drosophila melanogaster/genetics, Genomics/methods, Computational Biology

15.

Drought and heat stress: insights into tolerance mechanisms and breeding strategies for pigeonpea improvement.

Bakala, Harmeet Singh; Devi, Jomika; Singh, Gurjeet; Singh, Inderjit.

Planta ; 259(5): 123, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38622376

ABSTRACT

MAIN CONCLUSION: Pigeonpea has potential to foster sustainable agriculture and resilience under evolving climate change; understanding the bio-physiological and molecular mechanisms of heat and drought stress tolerance is imperative to developing resilient cultivars. Pigeonpea is an important legume crop that has potential resilience in the face of evolving climate scenarios. However, compared to other legumes, there has been limited research on abiotic stress tolerance in pigeonpea, particularly towards drought stress (DS) and heat stress (HS). To address this gap, this review delves into the genetic, physiological, and molecular mechanisms that govern pigeonpea's response to DS and HS. It emphasizes the need to understand how this crop combats these stresses and exhibits different types of tolerance and adaptation mechanisms through component traits. The current article provides a comprehensive overview of the complex interplay of factors contributing to the resilience of pigeonpea under adverse environmental conditions. Furthermore, the review synthesizes information on major breeding techniques, encompassing both conventional methods and modern molecular omics-assisted tools and techniques. It highlights the potential of genomics and phenomics tools and their pivotal role in enhancing adaptability and resilience in pigeonpea. Despite the progress made in genomics, phenomics and big data analytics, the complexity of drought and heat tolerance in pigeonpea necessitates continuous exploration at multi-omic levels. High-throughput phenotyping (HTP) is crucial for gaining insights into the complex interactions among genotype, environment, and management practices (GxExM). Thus, integration of advanced technologies in breeding programs is critical for developing pigeonpea varieties that can withstand the challenges posed by climate change.
This review is expected to serve as a valuable resource for researchers, providing a deeper understanding of the mechanisms underlying abiotic stress tolerance in pigeonpea and offering insights into modern breeding strategies that can contribute to the development of resilient varieties suited for changing environmental conditions.

Subjects

Droughts, Fabaceae, Plant Breeding, Fabaceae/genetics, Genomics/methods, Heat-Shock Response

16.

The impact of FASTQ and alignment read order on structural variant calling from long-read sequencing data.

Lesack, Kyle J; Wasmuth, James D.

PeerJ ; 12: e17101, 2024.

Article in English | MEDLINE | ID: mdl-38500526

ABSTRACT

Background: Structural variant (SV) calling from DNA sequencing data has been challenging due to several factors, including the ambiguity of short-read alignments, multiple complex SVs in the same genomic region, and the lack of "truth" datasets for benchmarking. Additionally, caller choice, parameter settings, and alignment method are known to affect SV calling. However, the impact of FASTQ read order on SV calling has not been explored for long-read data. Results: Here, we used PacBio DNA sequencing data from 15 Caenorhabditis elegans strains and four Arabidopsis thaliana ecotypes to evaluate the sensitivity of different SV callers to FASTQ read order. Comparisons of variant call format files generated from the original and permuted FASTQ files demonstrated that the order of input data affected the SVs predicted by each caller. In particular, pbsv was highly sensitive to the order of the input data, especially at the highest depths, where over 70% of the SV calls generated from pairs of differently ordered FASTQ files were in disagreement. These results demonstrate that read order sensitivity is a complex, multifactorial process, as the differences observed both within and between species varied considerably according to the specific combination of aligner, SV caller, and sequencing depth. In addition to the SV callers being sensitive to the input data order, the SAMtools alignment sorting algorithm was identified as a source of variability following read order randomization. Conclusion: The results of this study highlight the sensitivity of SV calling to the order of reads encoded in FASTQ files, which has not been recognized in long-read approaches. These findings have implications for the replication of SV studies and the development of consistent SV calling protocols. 
Our study suggests that researchers should pay attention to the input order sensitivity of read alignment sorting methods when analyzing long-read sequencing data for SV calling, as mitigating a source of variability could facilitate future replication work. These results also raise important questions surrounding the relationship between SV caller read order sensitivity and tool performance. Therefore, tool developers should also consider input order sensitivity as a potential source of variability during the development and benchmarking of new and improved methods for SV calling.
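The read-order permutation described in this abstract can be illustrated with a minimal Python sketch. It assumes uncompressed FASTQ text with exactly four lines per record. The function names `shuffle_fastq` and `call_disagreement`, and the representation of SV calls as hashable tuples, are hypothetical and are not taken from the study.

```python
import random


def shuffle_fastq(text, seed):
    """Return FASTQ text with the same records in a seeded random order.

    Assumption: every record occupies exactly four lines (header,
    sequence, '+', qualities), i.e. sequences are not line-wrapped.
    """
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    # A dedicated generator keeps the permutation reproducible per seed.
    random.Random(seed).shuffle(records)
    return "\n".join("\n".join(rec) for rec in records) + "\n"


def call_disagreement(calls_a, calls_b):
    """Fraction of distinct calls that appear in only one of two call sets."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)
```

Running an aligner and SV caller on the original and shuffled files, then passing the two call sets to `call_disagreement`, would give one simple measure of read-order sensitivity; the study itself may define disagreement differently.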

Subjects

Algorithms, Genomics, Genomics/methods, DNA Sequence Analysis/methods, Genome, DNA

17.

A method to correct for local alterations in DNA copy number that bias functional genomics assays applied to antibiotic-treated bacteria.

Sullivan, Geraldine J; Barquist, Lars; Cain, Amy K.

mSystems ; 9(4): e0066523, 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38470252

ABSTRACT

Functional genomics techniques, such as transposon insertion sequencing and RNA-sequencing, are key to studying relative differences in bacterial mutant fitness or gene expression under selective conditions. However, certain stress conditions, mutations, or antibiotics can directly interfere with DNA synthesis, resulting in systematic changes in local DNA copy numbers along the chromosome. This can lead to artifacts in sequencing-based functional genomics data when comparing antibiotic treatment to an unstressed control. Further, relative differences in gene-wise read counts may result from alterations in chromosomal replication dynamics, rather than selection or direct gene regulation. We term this artifact "chromosomal location bias" and implement a principled statistical approach to correct it by calculating local normalization factors along the chromosome. These normalization factors are then directly incorporated into statistical analyses using standard RNA-sequencing analysis methods without modifying the read counts themselves, preserving important information about the mean-variance relationship in the data. We illustrate the utility of this approach by generating and analyzing a ciprofloxacin-treated transposon insertion sequencing data set in Escherichia coli as a case study. We show that ciprofloxacin treatment generates chromosomal location bias in the resulting data, and we further demonstrate that failing to correct for this bias leads to false predictions of mutant drug sensitivity as measured by minimum inhibitory concentrations. We have developed an R package and user-friendly graphical Shiny application, ChromoCorrect, that detects and corrects for chromosomal bias in read count data, enabling the application of functional genomics technologies to the study of antibiotic stress. IMPORTANCE: Altered gene dosage due to changes in DNA replication has been observed under a variety of stresses with a variety of experimental techniques. 
However, the implications of changes in gene dosage for sequencing-based functional genomics assays are rarely considered. We present a statistically principled approach to correcting for the effect of changes in gene dosage, enabling testing for differences in the fitness effects or regulation of individual genes in the presence of confounding differences in DNA copy number. We show that failing to correct for these effects can lead to incorrect predictions of resistance phenotype when applying functional genomics assays to investigate antibiotic stress, and we provide a user-friendly application to detect and correct for changes in DNA copy number.
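The idea of a local normalization factor along the chromosome can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm implemented in ChromoCorrect: it assumes genes are already ordered by chromosomal position, treats the chromosome as circular, and uses a running median of per-gene log2 fold changes as the local offset. The function name `local_offsets` and the `window` parameter are hypothetical.

```python
from statistics import median


def local_offsets(log_fc, window=5):
    """Running median of log fold changes over a circular gene order.

    log_fc: per-gene log2 fold changes, ordered by chromosomal position.
    window: odd number of neighbouring genes used for each local estimate
            (an assumed smoothing choice, not a published default).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(log_fc)
    half = window // 2
    offsets = []
    for i in range(n):
        # Index modulo n so the window wraps around the circular chromosome.
        neighbours = [log_fc[(i + k) % n] for k in range(-half, half + 1)]
        offsets.append(median(neighbours))
    return offsets
```

In line with the abstract, such offsets would enter the downstream count model as a normalization term rather than being subtracted from the read counts. A gene whose fold change departs from its local offset is then a candidate for a real fitness or expression effect.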

Subjects

Anti-Bacterial Agents, DNA Copy Number Variations, Anti-Bacterial Agents/pharmacology, DNA Copy Number Variations/genetics, Genomics/methods, DNA Transposable Elements, Ciprofloxacin/pharmacology, Bacteria, RNA

18.

Reaction norm for genomic prediction of plant growth: modeling drought stress response in soybean.

Toda, Yusuke; Sasaki, Goshi; Ohmori, Yoshihiro; Yamasaki, Yuji; Takahashi, Hirokazu; Takanashi, Hideki; Tsuda, Mai; Kajiya-Kanegae, Hiromi; Tsujimoto, Hisashi; Kaga, Akito; Hirai, Masami; Nakazono, Mikio; Fujiwara, Toru; Iwata, Hiroyoshi.

Theor Appl Genet ; 137(4): 77, 2024 Mar 09.

Article in English | MEDLINE | ID: mdl-38460027

ABSTRACT

KEY MESSAGE: We proposed models to predict the effects of genomic and environmental factors on daily soybean growth and applied them to soybean growth data obtained with unmanned aerial vehicles. Advances in high-throughput phenotyping technology have made it possible to obtain time-series plant growth data in field trials, enabling genotype-by-environment interaction (G × E) modeling of plant growth. Although the reaction norm is an effective method for quantitatively evaluating G × E and has been implemented in genomic prediction models, no reaction norm models have been applied to plant growth data. Here, we propose a novel reaction norm model for plant growth using spline and random forest models, in which daily growth is explained by environmental factors one day prior. The proposed model was applied to soybean canopy area and height to evaluate the influence of drought stress levels. Changes in the canopy area and height of 198 cultivars were measured by remote sensing using unmanned aerial vehicles. Multiple drought stress levels were set as treatments, and their time-series soil moisture was measured. The models were evaluated using three cross-validation schemes. Although the accuracy of the proposed models did not surpass that of single-trait genomic prediction, the results suggest that our model can capture G × E, especially in the later growth period for the random forest model. Also, significant variations in the G × E of the canopy height during the early growth period were visualized using the spline model. This result indicates the effectiveness of the proposed models on plant growth data and the possibility of revealing G × E in various growth stages in plant breeding by applying statistical or machine learning models to time-series phenotype data.
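The core idea, daily growth explained by the environment one day earlier, can be illustrated with a deliberately simplified Python sketch. It swaps the spline and random forest models of the study for a straight-line reaction norm fitted by ordinary least squares for a single genotype. The function name `fit_linear_reaction_norm` and its arguments are hypothetical.

```python
def fit_linear_reaction_norm(sizes, env):
    """Fit growth[t] = intercept + slope * env[t - 1] by least squares.

    sizes: daily measurements of one genotype (e.g. canopy height).
    env:   daily environmental covariate (e.g. soil moisture); env[t - 1]
           is paired with the growth from day t - 1 to day t.
    Assumption: a linear response, unlike the spline and random forest
    models described in the abstract.
    """
    growth = [sizes[t] - sizes[t - 1] for t in range(1, len(sizes))]
    if len(growth) < 2 or len(env) < len(growth):
        raise ValueError("need at least 3 sizes and len(sizes) - 1 env values")
    x = env[:len(growth)]
    n = len(growth)
    mean_x = sum(x) / n
    mean_y = sum(growth) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("environment covariate must vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, growth))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

Under this simplification, the fitted slope is a genotype's sensitivity to the environmental covariate. Differences in slope among genotypes are one way to express G × E.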

Subjects

Droughts, Soybeans, Soybeans/genetics, Plant Breeding, Genome, Genomics/methods

19.

Tutorial on survival modeling with applications to omics data.

Zhao, Zhi; Zobolas, John; Zucknick, Manuela; Aittokallio, Tero.

Bioinformatics ; 40(3)2024 Mar 04.

Article in English | MEDLINE | ID: mdl-38445722

ABSTRACT

MOTIVATION: Identification of genomic, molecular and clinical markers prognostic of patient survival is important for developing personalized disease prevention, diagnostic and treatment approaches. Modern omics technologies have made it possible to investigate the prognostic impact of markers at multiple molecular levels, including genomics, epigenomics, transcriptomics, proteomics and metabolomics, and how these potential risk factors complement clinical characterization of patient outcomes for survival prognosis. However, the massive sizes of the omics datasets, along with their correlation structures, pose challenges for studying relationships between the molecular information and patients' survival outcomes. RESULTS: We present a general workflow for survival analysis that is applicable to high-dimensional omics data as inputs when identifying survival-associated features and validating survival models. In particular, we focus on the commonly used Cox-type penalized regressions and hierarchical Bayesian models for feature selection in survival analysis, which are especially useful for high-dimensional data, but the framework is applicable more generally. AVAILABILITY AND IMPLEMENTATION: A step-by-step R tutorial using The Cancer Genome Atlas survival and omics data for the execution and evaluation of survival models has been made available at https://ocbe-uio.github.io/survomics.
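The objective behind a Cox-type penalized regression can be sketched in Python as a negative log partial likelihood plus a lasso (L1) penalty. The tutorial itself is written in R, so this is an independent illustration and not its code. The function name `penalized_cox_loss` and its arguments are hypothetical, and tied event times are handled in the simple Breslow manner.

```python
import math


def penalized_cox_loss(times, events, X, beta, lam):
    """Negative Cox log partial likelihood plus an L1 penalty.

    times:  observed follow-up time per patient.
    events: 1 if the event was observed, 0 if censored.
    X:      list of feature rows, one per patient.
    beta:   coefficient vector, same length as each row of X.
    lam:    non-negative lasso penalty weight (assumed tuning parameter).
    """
    # Linear predictor for every patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loss = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loss -= eta[i] - math.log(risk)
    return loss + lam * sum(abs(b) for b in beta)
```

Minimizing this loss over `beta` shrinks many coefficients to exactly zero, which is what makes the lasso useful for feature selection when the number of omics features far exceeds the number of patients.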

Subjects

Genomics, Proteomics, Humans, Bayes Theorem, Genomics/methods, Genome, Epigenomics, Metabolomics

20.

Unlocking the Potential of Therapy-Induced Cytokine Responses: Illuminating New Pathways in Cancer Precision Medicine.

Gunturu, Dilip R; Hassan, Mohammed; Bedi, Deepa; Datta, Pran; Manne, Upender; Samuel, Temesgen.

Curr Oncol ; 31(3): 1195-1206, 2024 Feb 23.

Article in English | MEDLINE | ID: mdl-38534922

ABSTRACT

Precision cancer medicine primarily aims to identify individual patient genomic variations and exploit vulnerabilities in cancer cells to select suitable patients for specific drugs. These genomic features are commonly determined by gene sequencing prior to therapy, to identify individuals who would be most responsive. This precision approach in cancer therapeutics remains a powerful tool that benefits a smaller pool of patients, sparing others from unnecessary treatments. A limitation of this approach is that proteins, not genes, are the ultimate effectors of biological functions, and therefore the targets of therapeutics. An additional dimension in precision medicine that considers an individual's cytokine response to cancer therapeutics is proposed. Cytokine responses to therapy are multifactorial and vary among individuals. Thus, precision is dictated by the nature and magnitude of cytokine responses in the tumor microenvironment exposed to therapy. This review highlights cytokine responses as modules for precision medicine in cancer therapy, including potential challenges. For solid tumors, both detectability of cytokines in tissue fluids and their being amenable to routine sensitive analyses could address the difficulty of specimen collection for diagnosis and monitoring. Therefore, in precision cancer medicine, cytokines offer rational targets that can be utilized to enhance the efficacy of cancer therapy.

Subjects

Neoplasms, Precision Medicine, Humans, Precision Medicine/methods, Cytokines/therapeutic use, Neoplasms/therapy, Genomics/methods, Tumor Microenvironment
